2 research outputs found

    Thompson Sampling with Virtual Helping Agents

    We address the problem of online sequential decision making, i.e., balancing the trade-off between exploiting current knowledge to maximize immediate performance and exploring for new information to gain long-term benefits, using the multi-armed bandit framework. Thompson sampling is one heuristic for choosing actions that addresses this exploration-exploitation dilemma. We first propose a general framework that helps heuristically tune the exploration versus exploitation trade-off in Thompson sampling using multiple samples from the posterior distribution. Utilizing this framework, we propose two algorithms for the multi-armed bandit problem and provide theoretical bounds on the cumulative regret. Next, we demonstrate the empirical improvement in the cumulative regret performance of the proposed algorithm over Thompson sampling. We also show the effectiveness of the proposed algorithm on real-world datasets. Contrary to existing methods, our framework provides a mechanism to vary the amount of exploration/exploitation based on the task at hand. To this end, we extend our framework to two additional problems, namely best arm identification and time-sensitive learning in bandits, and compare our algorithm with existing methods.
    Comment: 14 pages, 8 figures
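    As a rough illustration of the idea of drawing multiple posterior samples per arm, the sketch below implements standard Beta-Bernoulli Thompson sampling and adds a hypothetical `num_samples` / `combine` knob. These names, the uniform Beta(1, 1) prior, and the choice of `max` or a mean as the combining rule are assumptions made for illustration; the abstract does not specify the two proposed algorithms, so this should not be read as the authors' method.

```python
import random


def thompson_select(successes, failures, num_samples=1, combine=max, rng=random):
    """Choose an arm from Beta posteriors using several draws per arm.

    With num_samples=1 this is ordinary Thompson sampling. The
    num_samples/combine parameters are illustrative assumptions, not
    taken from the paper.
    """
    scores = []
    for s, f in zip(successes, failures):
        # Beta(s + 1, f + 1) is the posterior under a uniform prior.
        draws = [rng.betavariate(s + 1, f + 1) for _ in range(num_samples)]
        scores.append(combine(draws))
    # Play the arm whose combined sample is largest.
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(true_means, horizon, num_samples=1, combine=max, seed=0):
    """Simulate a Bernoulli bandit and return (expected regret, successes, failures)."""
    rng = random.Random(seed)
    k = len(true_means)
    successes, failures = [0] * k, [0] * k
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        arm = thompson_select(successes, failures, num_samples, combine, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        regret += best - true_means[arm]
    return regret, successes, failures


def mean(draws):
    return sum(draws) / len(draws)


if __name__ == "__main__":
    for label, n, rule in [("TS", 1, max), ("max of 5", 5, max), ("mean of 5", 5, mean)]:
        regret, _, _ = run_bandit([0.3, 0.5, 0.7], 2000, n, rule)
        print(label, round(regret, 1))
```

    In this sketch, taking the maximum of several draws makes each arm's score more optimistic and so tends to increase exploration, while averaging the draws concentrates the score near the posterior mean and tends to increase exploitation.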

    Enhancing Cybersecurity of Unmanned Aircraft Systems in Urban Environments

    The use of lower airspace for air taxi and cargo applications opens up exciting prospects for futuristic Unmanned Aircraft Systems (UAS). However, ensuring the safety and security of these UAS within densely populated urban areas presents significant challenges. Most modern aircraft systems, whether unmanned or otherwise, rely on the Global Navigation Satellite System (GNSS) as a primary sensor for navigation. From a satellite navigation point of view, the dense urban environment compromises positioning accuracy through signal interference, multipath effects, and similar phenomena. Furthermore, civilian GNSS receivers are susceptible to spoofing attacks since they lack encryption capabilities. Therefore, in this thesis, we focus on examining the safety and cybersecurity assurance of UAS in dense urban environments, from both theoretical and experimental perspectives. To facilitate the verification and validation of the UAS, the first part of the thesis focuses on the development of a realistic GNSS sensor emulation using a Gazebo plugin. This plugin is designed to replicate the complex behavior of the GNSS sensor in urban settings, such as multipath reflections and signal blockages. By leveraging 3D models of urban environments and a ray-tracing algorithm, the plugin predicts the spatial and temporal patterns of GNSS signals in densely populated urban environments. The efficacy of the plugin is demonstrated for various scenarios including routing, path planning, and UAS cybersecurity. Subsequently, a robust state estimation algorithm for dynamical systems whose states can be represented by Lie groups (e.g., rigid body motion) is presented. Lie groups provide powerful tools to analyze the complex behavior of non-linear dynamical systems by leveraging their geometrical properties. The algorithm is designed for time-varying uncertainties in both the state dynamics and the measurements using the log-linear property of Lie groups. When unknown disturbances are present (such as GNSS spoofing and multipath effects), the log-linearization of the non-linear estimation error dynamics results in a non-linear evolution of the linear error dynamics. Sufficient conditions under which this non-linear evolution of the estimation error is bounded are derived, and Lyapunov stability theory is employed to design a robust filter in the presence of an unknown-but-bounded disturbance.